ABSTRACT

We present a search for spatial and redshift correlations in a 2 Å resolution
spectroscopic survey of the Lyα forest at 2.15 < z < 3.37 toward ten QSOs
concentrated within a 1° diameter field. We find a signal at 2.7σ significance for
correlations of the Lyα absorption line wavelengths between different lines of
sight over the whole redshift range. The significance rises to 3.2σ if we restrict
the redshift range to 2.60 < z < 3.37, and to 4.0σ if we further restrict the
sample to lines with rest equivalent width 0.1 < W₀/Å < 0.9. We conclude that
a significant fraction of the Lyα forest arises in structures whose correlation
length extends at least over 30 arcmin (~26 h⁻¹ comoving Mpc at z = 2.6 for
H₀ = 100h km s⁻¹ Mpc⁻¹, Ω = 1.0, Λ = 0). We have also calculated the three
dimensional two point correlation function for Lyα absorbers; we do not detect
any significant signal in the data. However, we note that line blending prevents
us from detecting the signal produced by a 100% overdensity of Lyα absorbers
in simulated data. We find that the Lyα forest redshift distribution provides
a more sensitive test for such clustering than the three dimensional two point
correlation function.



Subject headings: cosmology: large-scale structure of universe - cosmology: 
observations - galaxies: quasars: absorption lines - galaxies: intergalactic 
medium 



1. Introduction 



Much progress has been made in understanding the origin of the numerous narrow Lyα
absorption lines observed in quasar spectra since their discovery (Lynds 1971). Their large
number density along a typical line of sight (Sargent et al. 1980) shows a strong evolution
with redshift, outnumbering any other known object (Lu, Wolfe, & Turnshek 1991; Bechtold
1994) for redshifts accessible from the ground (z > 1.6). Comparison between spectra for
each component of multiply lensed quasars (Foltz et al. 198?; Smette et al. 1992, 1995)
or close quasar pairs (Bechtold et al. 1994; Dinshaw et al. 1994, 1995, 1997; Fang et al.
1996; D'Odorico et al. 1998; Petitjean et al. 1998) indicates that they are produced in large
tenuous clouds with diameters exceeding 50 h⁻¹ kpc (H₀ = 100h km s⁻¹ Mpc⁻¹). The
high signal to noise ratio spectra obtained with the 10m Keck telescope have revealed the
presence of C IV absorption lines associated with 75% of the lines with column densities
N_HI > 3 × 10¹⁴ cm⁻² and 90% of the ones with N_HI > 1.6 × 10¹⁵ cm⁻² (Songaila & Cowie 1996).
Furthermore, the lines with N_HI > 1.6 × 10¹⁵ cm⁻² show only an order of magnitude range
in ionization ratios. These observations indicate that, although of low metallicities, these
clouds are not made of pristine material, which in turn suggests the existence of a very first
generation of massive stars contaminating the intergalactic medium with heavy elements
(Miralda-Escudé & Rees 1997) as they turned supernovae.



A possible association with galaxies, which could also explain their metal content as
processed gas, is unclear (Morris et al. 1993; Morris & van den Berg 1994; Mo & Morris
1994; Lanzetta et al. 1995; Bowen, Pettini, & Blades 1996; Le Brun, Bergeron, & Boissé
1996; van Gorkom et al. 1996; Rauch, Haehnelt, & Steinmetz 1997; Chen et al. 1998; Tripp,
Lu, & Savage 1998; Grogin & Geller 1998; Impey, Petry, & Flint 1999). At low redshift,
where the detection of galaxies is fairly complete to low luminosity, a consensus appears
that the largest column density Lyα systems are distributed more like galaxies than the
low column density ones (log N_HI ≲ 14-15 or rest equivalent width W₀ ≲ 0.1-0.3 Å; see
Shull et al. 19??).
There is evidence that lower column density systems also correlate with galaxies
(Tripp, Lu, & Savage 1998; Impey, Petry, & Flint 1999), though Impey et al. found that
it is not possible to assign uniquely a galaxy counterpart to an absorber, and that there is
no support for absorbers to be located preferentially with the haloes of luminous galaxies.
Extrapolations of these Lyα cloud-galaxy correlations to z ~ 2-3 (in the general context
of galaxy and density perturbation distributions) are consistent with observed Lyα cloud
clustering properties, which have only revealed signals on very small velocity scales (Webb
1987; Cristiani et al. 1995, 1997; Meiksin & Bouchet 1995; Cowie et al. 1995; Songaila &
Cowie 1996; Fernández-Soto et al. 1996).



There has been much effort made to examine the two point correlation function at
z > 1.6 in the Lyα forest along isolated lines of sight, with some contradictory results. On
one side, Cristiani et al. (1997) found a 5σ detection at Δv < 300 km s⁻¹ and Khare et al.
(1997) detected a > 3σ signal at Δv = 50-100 km s⁻¹, based on 4m-class telescope echelle
spectra, and Kim et al. (1997) presented a 2.5-2.8σ significance signal at Δv = 75 km s⁻¹
based on 10m Keck-HIRES data. On the other, Kirkman & Tytler (1997) failed to confirm
such claims with high signal to noise ratio Keck-HIRES spectra.

A complementary approach is to examine structure between adjacent lines of sight.
For the Lyα forest on small scales, Crotts (1989) and Crotts & Fang (1998) searched for
spatial structure in the Lyα forest at z < 2.6 on angular scales of 2 < Δθ/arcmin < 3
separation. They found that the two point correlation function presents an excess for
velocity separations of Δv ~ 200 km s⁻¹ for W₀ > 0.4 Å absorbers, with a tentative
conclusion that W₀ > 0.4 Å absorbers are sheet-like. Their results indicate the existence of
coherent structure on scales ~0.7 h⁻¹ comoving Mpc at z > 2. At lower redshift, Dinshaw
et al. (1995) found evidence of clustering on the scale of 100 km s⁻¹ at 0.5 < z < 0.9 over
a separation of Δθ = 1.4 arcmin (~350 h⁻¹ kpc), though it is not clear whether this scale
probes the same clouds or is more characteristic of a correlation length. Theoretically,
McDonald & Miralda-Escudé (1999) calculated the correlation function in three dimensions
for the Lyα forest between lines of sight separated by Δθ < 5 arcmin, suggesting that such
a method could be used to measure cosmological parameters (Ω₀, Ω_Λ).

On larger scales, correlated C IV absorption between lines of sight separated by several
tens of arcmin has already revealed structures on the scale of several Mpc (Williger et al.
1996, hereafter Paper I; Dinshaw & Impey 1996). A marginal correlation in the Lyα forest
has been suggested on the 1° scale (Pierre et al. 199?). Otherwise, Lyα forest correlations
on large angular scales have remained largely unexplored.

A parallel analysis of the same south Galactic pole spectra as used here was carried out
by Liske et al. (1999). It uses a new method to study correlations based on the statistics
of transmitted flux. Their results reveal the C IV cluster at z ~ 2.3 which was found in
Paper I, as well as a void toward four lines of sight at least 36 × 24 h⁻² comoving Mpc²
in extent at z = 2.97 at the 4σ significance level. The void happens to coincide with the
location of a nearby QSO.

On the theoretical side, several recent N-body simulations performed in boxes of
10-20 h⁻¹ comoving Mpc (Cen et al. 1994; Petitjean, Mücket, & Kates 1995; Hernquist
et al. 1996; Mücket et al. 1996; Cen & Simcoe 1997; Zhang et al. 1997, 1998; Davé et al.
1999) and an analytical study (Bi & Davidsen 1997) suggest that Lyα clouds are associated
with filaments or large, flattened structures, similar to Zel'dovich pancakes, associated
with low overdensity of the dark matter distribution (δρ/⟨ρ⟩ < 30). Such structures may
be detectable as correlations in the Lyα forest toward groups of adjacent QSOs. Detailed
analyses for the observable effects of such ~30 h⁻¹ comoving Mpc scale structures toward
groups of QSOs on the sky have not been performed, though much smaller scales (< 0.56 h⁻¹
comoving Mpc) have been considered (Charlton et al. 1997).



In this paper, the Lyα forest toward ten 2.36 < z < 3.44 QSOs concentrated within a
1° diameter field near the SGP is used as a probe for the existence of ~30 h⁻¹ comoving
Mpc scale structures transverse to the lines of sight. In §2, we review the observational
data used in the analysis. In §3, we describe the statistical tests made, and the results we
obtained. We discuss the implications of the correlations we find in §4. Throughout this
paper, we assume Ω = 1.0 and Λ = 0.



2. The Data 

The observational data consist of the 10 highest signal to noise ratio spectra covering
the Lyα forest that were obtained during a parallel study (Paper I) on the large scale
structure revealed by C IV absorbers, in front of 25 QSOs at z > 2 within a ~1° diameter
field. The location of the QSOs on the sky and details of the observations and reductions
are presented in Paper I. The instrumental resolution was ~2 Å, which allows us to resolve
lines with a velocity difference of Δv > 140 km s⁻¹ at z = 3. The signal to noise ratio
per 1 Å pixel reaches up to 40 between the Lyα and Lyβ emission lines. We stress the
homogeneity of the instrumental set-up, of the reduction process and of the line list
preparation procedures. The mean 1σ uncertainty in wavelength centroids is
σ_v = 13 km s⁻¹.

The sample used for the analysis contains all the Lyα lines with rest equivalent widths
W₀ > 0.1 Å detected at the 5σ significance level and which lie between the Lyα and
Lyβ emission line wavelengths. However, we have excluded lines that belong to known
metal absorbers (cf. Paper I) or lie within 5000 km s⁻¹ from the background QSO. The
latter condition has been set to avoid uncertainties introduced by the "proximity effect"
(which reduces the line number density and the equivalent widths of the lines). The line
sample is thus a subset taken from Table 2 of Paper I, and consists of 383 Lyα lines at
2.15 < z < 3.26. However, we note that completeness is only reached over the whole set of
spectra for lines with W₀ ≥ 0.5 Å. It is now generally accepted that there are very few Lyα
lines with b < 18 km s⁻¹ (Hu et al. 1995; Kirkman & Tytler 1997). Consequently,
all lines with W₀ > 0.1 Å observed here have log(N_HI/cm⁻²) > 13.4. Figure 1 presents the
locations and rest equivalent widths of the absorbers projected onto the right ascension and
declination planes.

Each individual line of sight shows no unusual distribution of absorption systems
with redshift or rest equivalent width. We used the program SEWAGE (Sophisticated and
Efficient W* And Gamma Estimator), kindly provided to us by Dobrzycki (1999). The
redshift number density of W₀ > 0.5 Å lines is consistent with a power law distribution
(Lu, Wolfe, & Turnshek 1991) d𝒩/dz ∝ (1 + z)^γ, with a maximum likelihood value
γ = 2.02 ± 1.24. The corresponding rest equivalent width distribution is consistent with
d𝒩/dW ∝ exp(-W/W*), W* = 0.43 ± 0.04. We find no large voids in any line sample defined
by a minimum rest equivalent width, following the method described in Ostriker, Bajtlik,
& Duncan (1988).
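The internals of SEWAGE are not described here, so the following is only a minimal sketch of how a maximum likelihood value of W* can be obtained for an exponential distribution d𝒩/dW ∝ exp(-W/W*) above a threshold. It assumes no upper cut-off in W and uses the textbook estimator (sample mean minus threshold). The function name and the demonstration widths are hypothetical and are not survey data.

```python
import math

def fit_w_star(widths, w_min):
    """Maximum likelihood W* for dN/dW ~ exp(-W/W*) restricted to W >= w_min.

    Returns (W*, asymptotic 1-sigma uncertainty, number of lines used).
    Assumes an untruncated exponential above w_min (an assumption of this sketch).
    """
    sample = [w for w in widths if w >= w_min]
    if not sample:
        raise ValueError("no lines above the threshold")
    n = len(sample)
    w_star = sum(sample) / n - w_min   # MLE of an exponential shifted to w_min
    sigma = w_star / math.sqrt(n)      # large-sample uncertainty of the MLE
    return w_star, sigma, n

# Hypothetical rest equivalent widths in Angstrom (illustrative only).
demo_widths = [0.52, 0.61, 0.75, 0.90, 1.10, 1.45, 0.30, 0.12]
w_star, sigma, n_used = fit_w_star(demo_widths, w_min=0.5)
print(f"W* = {w_star:.3f} +/- {sigma:.3f} from {n_used} lines")
```

The power law index γ has no closed-form estimator and would need a one-dimensional numerical maximization of the likelihood, which is omitted from this sketch.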

The instrumental resolution of ~2 Å makes the minimum detected separation in
a single spectrum between lines of strongly different strengths to be ~3.7 Å, which
corresponds to 1.0-1.7 h⁻¹ comoving Mpc along the line of sight over 2.156 < z < 3.258.
The two closest lines of sight in our sample are 6.1 arcmin apart (toward Q0042-2656 and
Q0042-2657, whose common Lyα forest coverage lies in the range 2.653 < z < 2.836)
or 5.2 h⁻¹ comoving Mpc in the plane of the sky. Therefore, despite the unprecedented
number of close lines of sight, we do not expect that our study would a priori bring any
new result for structure on small scales: on the one hand, a few high signal to noise ratio
spectra - already existing in the literature - would be sufficient to reveal structures larger
than 5 h⁻¹ comoving Mpc (e.g. Cristiani et al. 1997), and on the other, spectra of close
quasar pairs have already been studied to search for (and find) structure at the < 1 Mpc
scale (cf. Crotts & Fang 1998).
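The conversion from angular separation to comoving distance can be sketched for the adopted cosmology (Ω = 1, Λ = 0), where the comoving distance is D(z) = (2c/H₀)[1 - (1 + z)^(-1/2)]. The function names are hypothetical. The redshift at which the 6.1 arcmin separation was evaluated is not stated in the text, so the upper end of the common coverage (z = 2.836) is assumed here.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
H0_PER_H = 100.0      # H0 = 100 h km/s/Mpc, so distances are in h^-1 Mpc

def comoving_distance(z):
    """Line-of-sight comoving distance in h^-1 Mpc for Omega = 1, Lambda = 0."""
    return 2.0 * C_KM_S / H0_PER_H * (1.0 - 1.0 / math.sqrt(1.0 + z))

def transverse_separation(theta_arcmin, z):
    """Comoving separation in h^-1 Mpc subtended by theta_arcmin at redshift z."""
    return comoving_distance(z) * math.radians(theta_arcmin / 60.0)

def radial_depth(z_low, z_high):
    """Comoving depth in h^-1 Mpc between two redshifts along the line of sight."""
    return comoving_distance(z_high) - comoving_distance(z_low)

# Closest pair of sightlines, evaluated at an assumed z = 2.836.
print(f"{transverse_separation(6.1, 2.836):.1f} h^-1 comoving Mpc")
# Depth of the full Lya forest coverage, 2.155 < z < 3.258.
print(f"{radial_depth(2.155, 3.258):.0f} h^-1 comoving Mpc")
```

Under these assumptions the sketch gives values close to the 5.2 h⁻¹ Mpc separation and the 470 h⁻¹ Mpc depth quoted in the text.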



The angular separation between any two lines of sight actually ranges from 6.1 to 69.2
arcmin (5.2 h⁻¹ to 52.7 h⁻¹ comoving Mpc). Between two and seven lines of sight probe any
given redshift over 2.155 < z < 3.258. The Lyα forest spectral coverage corresponds to a
region with a depth of 470 h⁻¹ comoving Mpc along the line of sight.

Fig. 1. Above: Solid lines show the redshift coverage for Lyα in the QSO data in the
sample, projected onto the declination plane and centered at right ascension α = 0h44m42.15s
(J2000). Open stars indicate QSOs used in the sample from Paper I, with their names being
listed adjacent. Spectral regions within 5000 km s⁻¹ from the background QSOs are not
used, to discard absorption systems possibly affected by the proximity effect. Absorption
lines with W₀ > 0.1 Å at > 5σ detection significance were used in the analysis, and are
shown as ticks whose length is proportional to W₀. Absorbers corresponding to known metal
systems from the C IV survey have been omitted. Below: Same as above, except projected
onto the right ascension plane and centered at declination δ = -27°25'06" (J2000).

3. Statistical Analysis

The following subsections describe the statistical analysis we performed to search for
structures in the Lyα forest spanning two or more lines of sight. In §3.1, we detail the
construction of random control samples of Lyα forest spectra and line lists, which are free of
correlations. The control samples will be used to determine the significance of any features
we find in various forms of the Lyα forest correlation function and redshift distribution.
We then test for structures extended in three dimensions (§3.2) and in the plane of the sky
(§3.3).



3.1. Creation of random control samples 

We realize that the spectral resolution and signal to noise ratio of the spectra are 
barely adequate for the study of the dense Lyα forest at z ~ 3. In addition, the specific,
irregular arrangement of detection windows in redshift space and lines of sight could create 
a subtle pattern of aliasing on large scales, comparable to the separation between lines of 
sight and the extent that each spectrum probes along the line of sight. To overcome these 
difficulties, we created control samples free of correlations between absorbers. These control 
samples should have characteristics as similar as possible to the observed spectra. This 
section describes how we reached that goal. 

We simulated the data directly from the Lyα absorber distribution functions in H I
column density and Doppler parameter as recently determined using Keck-HIRES spectra
(Kim et al. 1997). Since they provide the distribution characteristics for 3 different mean
redshifts, interpolation functions are needed to accommodate the fast redshift evolution of
the different parameters involved.

We find that the following functions describe the data well in the redshift range
2.3 < z < 3.25. Their validity is doubtful outside this range. The number density of Lyα
clouds per unit absorption length X and per unit column density N_HI is given by:

$$\frac{d^2\mathcal{N}}{dN_{\rm HI}\,dX} = f(z)\,N_{\rm HI}^{-\beta(z)}, \qquad (1)$$

where

$$\log f(z) = -0.34 + 2.0\,(1+z), \qquad (2)$$

and

$$\beta(z) = 0.8666\,(1+z)^{[\ldots]}. \qquad (3)$$



In order to reproduce the break in the column density distribution observed at z < 2.7
(Kim et al. 1997; cf. also Petitjean et al. 1993), we randomly eliminated 75% of the z < 2.7
lines with log N_HI > 14.3. Although the data show a more gradual break with redshift, we
find that this simple method is good enough for our purpose to produce control samples.

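The Doppler parameter distribution is described by a Gaussian with a cut-off at low b
values:

$$\frac{d\mathcal{N}}{db} \propto \exp\left\{-\frac{[b-\bar{b}(z)]^2}{2\,\sigma_b(z)^2}\right\} \quad \text{for } b \ge b_c(z), \qquad (4)$$

= 0, otherwise.

The Doppler parameters depend on z in the following way:

$$\bar{b}(z) = 64.6\,(1+z)^{-[\ldots]}\ \mathrm{km\,s^{-1}}, \qquad (5)$$

$$\sigma_b(z) = 96.0\,(1+z)^{-[\ldots]}\ \mathrm{km\,s^{-1}}. \qquad (6)$$

The low cut-off value is b_c(z) = 15 km s⁻¹, independent of redshift.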

A random process is used to determine the wavelengths of the lines so that their
distribution is Poissonian in redshift. The mean redshift density is set to be equal to the value
expected by integrating equation (1) over the column density range 12.5 < log N_HI < ∞
at the given z; we use the relation dX = (1 + z) dz, which is valid for the value q₀ = 0
adopted by Kim et al. (1997), so that the simulations are independent of the cosmological
parameters. Values for the column density and the Doppler parameter b were then
independently attributed to each line, following the distributions described above.
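The construction of an input line list can be sketched as follows. This is an illustration of the steps described above, not the code used for the control samples. The redshift range is split into narrow bins, a Poisson count is drawn in each, column densities are drawn from a power law above log N_HI = 12.5, 75% of the z < 2.7 lines with log N_HI > 14.3 are removed, and Doppler parameters are drawn from a Gaussian truncated at 15 km s⁻¹. The line density, β, mean b and σ_b used in the demonstration are hypothetical placeholders and do not reproduce the interpolation functions of equations (1)-(6).

```python
import math
import random

def poisson_draw(rng, mean):
    """Knuth's multiplicative Poisson sampler; adequate for small per-bin means."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_line_list(rng, z_min, z_max, rate_per_dz, beta, b_mean, b_sigma,
                       b_cut=15.0, log_n_min=12.5, dz=0.01):
    """Return a list of (z, log N_HI, b) tuples with Poissonian redshifts."""
    lines = []
    n_bins = int(round((z_max - z_min) / dz))
    for i in range(n_bins):
        z_lo = z_min + i * dz
        expected = rate_per_dz(z_lo + 0.5 * dz) * dz   # mean count in this bin
        for _ in range(poisson_draw(rng, expected)):
            z = z_lo + dz * rng.random()
            u = 1.0 - rng.random()                      # uniform on (0, 1]
            # Inverse CDF of a power law N^-beta above N_min (requires beta > 1).
            log_n = log_n_min - math.log10(u) / (beta(z) - 1.0)
            # Mimic the break at z < 2.7: drop 75% of the strong lines.
            if z < 2.7 and log_n > 14.3 and rng.random() < 0.75:
                continue
            b = rng.gauss(b_mean(z), b_sigma(z))
            while b < b_cut:                            # reject below the cut-off
                b = rng.gauss(b_mean(z), b_sigma(z))
            lines.append((z, log_n, b))
    return lines

# Hypothetical inputs: a rate proportional to (1 + z) mimics dX = (1 + z) dz.
rng = random.Random(42)
demo = simulate_line_list(rng, 2.3, 3.25,
                          rate_per_dz=lambda z: 40.0 * (1.0 + z),
                          beta=lambda z: 1.5,
                          b_mean=lambda z: 28.0,
                          b_sigma=lambda z: 10.0)
print(len(demo), "simulated lines")
```

The bin-wise Poisson draw approximates an inhomogeneous Poisson process by a piecewise constant rate; a smaller dz tightens the approximation at the cost of run time.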

Given the redshift, column density and Doppler parameter for each line (the input 
line list), Voigt profiles can be calculated. The resulting high-resolution spectrum is then 
convolved with a Gaussian point spread function (PSF) of 2 Å FWHM, and rebinned so
that its sampling is equivalent to the corresponding individual QSO spectrum which it 
simulates. Photon and read-out noise have been added so that the final spectrum has a 
signal to noise ratio comparable, at each wavelength, to that of the corresponding observed 
spectrum. 

Such a method naturally accounts for cosmic variance. 

We used the same software to search automatically for absorption lines in the simulated
spectra as we did for the observed data (cf. Paper I). We derived for the simulations and
the data the redshift distribution index γ defined by d𝒩/dz ∝ (1 + z)^γ, the rest equivalent
width distribution index W* defined by d𝒩/dW ∝ exp(-W/W*), and the distribution of the total
number of lines with rest equivalent width 𝒩(> W₀). We also compared γ and W* to values
from Bechtold (1994). Her total sample consists mainly of her medium resolution sample
(spectral resolution generally between 50 and 100 km s⁻¹ FWHM, 5σ W₀ detection limit,
weighted by redshift coverage, of 0.172 Å), which is higher resolution and only marginally
noisier than ours (~120-160 km s⁻¹ FWHM, mean 5σ W₀ detection limit of 0.165 Å).
The redshift index γ for the simulations agrees well with the values from Bechtold and
the independence of γ with resolution (Parnell & Carswell 1988), and is consistent with
our observed data (Figure 2). The rest equivalent width index W* for the simulations is
consistent with our data (Figure 3). The Bechtold data indicate a larger number of low W₀
lines relative to high W₀ lines than in our sample, as expected for the difference in spectral
resolution. The distribution of the total number of lines 𝒩(> W₀) is very consistent with
the simulations (Figure 4).

We used between 100 and 1000 synthetic spectra as controls for the calculations in this 
paper. 



3.2. Search for structures extended in three dimensions 

We used two different methods to search for correlations of Lyα forest absorbers in
three dimensions: (1) the two point correlation function, and (2) the redshift number
density.

1. Two point correlation function: We constructed the two point correlation function
in three dimensions as in Paper I. However, we did not use the DD/DR estimator of Davis
& Peebles (1983), in which the observed data (D) are cross-correlated with a randomly
generated data set (R) to provide the normalization for the distribution of the absorber
pairs. The absorption line density is so high in the higher redshift portion of our data that
we would not be able to detect absorber separations much smaller than observed. Rather,
we used the DD/RR estimator, and computed the average and first moment about the
mean of the two point correlation function directly from the simulations. We found no
significant signal in the observed data, nor in any subset of the data as a function of redshift
or rest equivalent width detection threshold.

We performed a heuristic check that our algorithm would indeed reveal any clustering,
by creating an artificial cluster in the simulated data. The artificial cluster was produced
by adding a 100% overdensity of absorbers in the redshift range 2.700 < z < 2.765 into the
input line list used for a set of simulated spectra; these absorbers are common and identical
for all quasars of a given set. Their characteristics (z, N_HI, b) were obtained following the
same procedure that produces the input line list described above. The redshift range is
the best-sampled one in our data, as it is probed, at least partially, by six QSO sightlines
for which we have high signal to noise ratio spectra. This artificial cluster would cover
approximately 35 × 20 h⁻² comoving Mpc² on the plane of the sky, and span 25 h⁻¹ comoving
Mpc along the line of sight; it would be as extended along the line of sight as the C IV
groups described in Paper I, and slightly wider on the plane of the sky.

Fig. 2. The maximum likelihood estimate of the power law index γ vs. rest equivalent
width threshold W₀, with 1σ error bars. Open circles show the observed data and triangles
the mean of 1000 Monte Carlo simulations (offset by Δz = -0.02 for clarity); squares
represent the results of Bechtold (1994) for the first three samples in her Table 4 (absorption
z < z_QSO - 0.15, medium and low resolution samples combined).

Fig. 3. The maximum likelihood rest equivalent width distribution index W* vs. rest
equivalent width detection threshold W₀. Open circles show observed data and triangles
the mean of 1000 Monte Carlo simulations (offset by ΔW₀ = -0.02 Å for clarity). Squares
represent the results of Bechtold (1994) for the first three samples in her Table 4 (absorption
z < z_QSO - 0.15, medium and low resolution samples combined). Error bars indicate 1σ
uncertainties. The lower values of W* for the Bechtold data likely arise from the higher
resolution of her spectra.

Fig. 4. The cumulative number of lines greater than rest equivalent width 𝒩(> W₀) for the
data (open circles) and simulations (triangles), with 1σ uncertainties marked by error bars,
for the redshift range 2.15 < z < 3.26. The simulations have been offset by ΔW₀ = -0.02
Å for clarity.

We produced 100 simulated sets of spectra with such an artificial cluster, and
determined the corresponding three dimensional correlation function in the same way as
for the observed data. Due to line blending, the mean number of "detected" lines in the
artificially overdense region increases only by 27%, compared to the mean number of lines
in the simulated spectra with no artificial cluster. However, the mean rest equivalent width
of the lines in that interval increases by 41% from ⟨W₀,random⟩ = 0.66 Å to ⟨W₀,cluster⟩ = 0.93
Å, but is not detected significantly since the first moment about the mean is σ(W₀) ~ 0.6 Å
in both cases.

Similarly, we created artificially clustered data sets with overdensities of 25% and
100% at 2.200 < z < 2.300. We do not find any evidence for a significant signal in the three
dimensional two point correlation function for any of the artificially correlated cases.

2. Redshift number density: For a second test, we computed the redshift distributions
of the observed absorbers toward all lines of sight in our sample. We compare the
observed number of absorbers d𝒩_obs/dz with the expected mean and first moment about
the mean from the simulations, d𝒩_exp/dz and σ(d𝒩_exp/dz), to define a significance level
SL_d𝒩/dz = (d𝒩_obs/dz - d𝒩_exp/dz)/σ(d𝒩_exp/dz). The data produce no significant features
in d𝒩/dz for a variety of rest equivalent width detection thresholds (Figure 5). The most
significant feature is an overdensity of lines (SL_d𝒩/dz ~ 2) at 2.2 < z < 2.3 which is
strongest when weak lines (W₀ > 0.1-0.4 Å) are included in the sample. This redshift
range partially includes the one covered by a group of C IV absorbers found in Paper I,
whose corresponding Lyα lines have already been removed from our sample. We therefore
find no significant evidence for an overdensity of absorbers in a given redshift interval in the
observed data.
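The significance level defined above can be evaluated bin by bin from a set of control realizations, as in the following minimal sketch. It assumes that the "first moment about the mean" corresponds to the sample standard deviation of the simulated counts. The function name and the counts in the demonstration are hypothetical.

```python
import statistics

def significance_levels(observed_counts, control_counts):
    """SL per redshift bin: (observed - control mean) / control standard deviation.

    observed_counts: one count per redshift bin.
    control_counts:  list of simulated realizations, each with one count per bin.
    """
    levels = []
    for k, n_obs in enumerate(observed_counts):
        trials = [realization[k] for realization in control_counts]
        mean = statistics.fmean(trials)
        sigma = statistics.stdev(trials)   # sample standard deviation (assumed)
        levels.append((n_obs - mean) / sigma if sigma > 0 else float("nan"))
    return levels

# Hypothetical counts in two redshift bins and five control realizations.
observed = [13, 9]
controls = [[10, 9], [12, 11], [8, 10], [11, 8], [9, 12]]
sl = significance_levels(observed, controls)
print([round(value, 2) for value in sl])
```

With only a handful of realizations the standard deviation is itself poorly determined, which is one reason to use 100 to 1000 control sets.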

To test the sensitivity of the redshift distribution to the presence of clustered Lyα
lines, we also calculated d𝒩/dz for the three artificially clustered data sets. Only the 100%
overdense case at 2.2 < z < 2.3 produces a detectable signal (at the SL_d𝒩/dz ~ 3-4 level) in
the line number density (Figure 6).

We do not detect any significant structures extended in three dimensions in the
observed data, and also conclude that the three dimensional two point correlation function
is a less sensitive indicator of clustering in the Lyα forest in three dimensions than the
redshift number density.

Fig. 5. [...] range of rest equivalent width minimum values. The 68%, 95% and 99% confidence
intervals are shown by the dark, medium and light grey shaded regions.

Fig. 6. The open circles show the significance level SL_d𝒩/dz of the number of absorption
lines in the redshift number density distribution d𝒩/dz of absorption lines at 2.70 < z <
2.80, for a simulated 100% overdensity of lines at 2.700 < z < 2.765, for a series of rest
equivalent width detection thresholds W₀. The first moment about the mean scatter is
indicated by the vertical lines. The squares are for a similar 25% overdensity (offset by
ΔW₀ = 0.02 Å for clarity) and the triangles are for a 100% overdensity at 2.200 < z < 2.300
(offset by ΔW₀ = -0.02 Å). We used 100 simulated data sets with no overdensity as controls.
At the resolution of our spectra, the largest effect of an overdensity is to increase the measured
rest equivalent width of absorption lines, rather than to increase the number of detected lines.



3.3. Search for structures in the plane of sky 

In contrast to the two point correlation function in three dimensions, the two point
correlation function in velocity space ξ(Δv) has successfully revealed (apparent) clustering
(as there is no way to separate out peculiar velocities) in the Lyα forest between lines
of sight on the scale of up to 3 arcmin (Crotts & Fang 1998). In order to extend the
exploration of such correlations on scales up to 69 arcmin, we calculated ξ(Δv(λ_i, λ_j)),
where Δv(λ_i, λ_j) = 2c (λ_i - λ_j)/(λ_i + λ_j) is the velocity difference between two lines
detected at λ_i and λ_j in the spectra of two different quasars, where c is the speed of light.
We computed the first moment about the mean σ(ξ) directly from the simulated control
sample line lists.

A 50 km s⁻¹ bin size was chosen, which is more than 3 times larger than the typical
error on the determination of the observed line centroids. It is large enough so that each bin
contains on average at least 67 lines drawn from the control sample. This test is sensitive
to structures with large transverse extent in the plane of the sky, but not necessarily large
extent along the line of sight.
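The pair counting behind ξ(Δv) can be sketched as follows, using the velocity difference defined above and 50 km s⁻¹ bins. Only pairs drawn from two different sightlines are counted. Binning on the absolute value of Δv is an assumption of this sketch, and the function names and wavelengths are hypothetical.

```python
C_KM_S = 299792.458   # speed of light in km/s

def velocity_difference(lam_i, lam_j):
    """Velocity difference in km/s between lines observed at lam_i and lam_j."""
    return 2.0 * C_KM_S * (lam_i - lam_j) / (lam_i + lam_j)

def cross_pair_histogram(sightlines, bin_width=50.0, v_max=500.0):
    """Count line pairs from different sightlines in bins of |dv|.

    sightlines: list of lists of observed line wavelengths (same units).
    Returns a list of counts, one per bin of width bin_width up to v_max.
    """
    counts = [0] * int(v_max // bin_width)
    for a in range(len(sightlines)):
        for b in range(a + 1, len(sightlines)):   # different sightlines only
            for lam_i in sightlines[a]:
                for lam_j in sightlines[b]:
                    dv = abs(velocity_difference(lam_i, lam_j))
                    if dv < v_max:
                        counts[int(dv // bin_width)] += 1
    return counts

# Hypothetical observed wavelengths (Angstrom) toward two QSOs.
demo_sightlines = [[4500.0, 4600.0], [4501.0, 4650.0]]
demo_counts = cross_pair_histogram(demo_sightlines)
print(demo_counts)
```

Running the same function on each control line list gives the mean and scatter of the pair counts per bin, against which the observed counts can be compared.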



3.3.1. A significant signal at 2.60 < z < 3.26? 

If we include the entire redshift range of our data sample, the most deviant feature
of the two point correlation function ξ(Δv) is a 2.7σ(ξ) excess of pairs with W₀ > 0.1 Å
Lyα absorption lines and velocity differences 50 < Δv/(km s⁻¹) < 100 (Figure 7). We then
divided the sample using criteria based on the redshift, rest equivalent width and angular
separation between the different lines of sight to search for the origin of this possible signal.

Limiting the line redshifts to the range 2.60 < z < 3.26 reveals an overdensity of line
pairs significant at the 3.2σ(ξ) confidence level (Figure 8 and Table 1): our observations
provide 79 pairs of W₀ > 0.1 Å lines in the 50 < Δv/(km s⁻¹) < 100 velocity bin, an excess
of 42% compared to a mean and 1σ dispersion of only 55.6 ± 7.4 derived from the control
sample. The probability of finding such an excess in any bin is P = 0.0012. Surprisingly,
we do not detect any significant signal at velocity splittings Δv < 50 km s⁻¹, a point which
we will examine in detail later (§3.3.5).
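As a consistency check on the quoted numbers, the significance levels in Table 1 follow from SL = (N_obs - N_exp)/σ_exp, and the 42% excess follows from the ratio of observed to expected pairs. The sketch below recomputes both from the tabulated values; the variable and function names are hypothetical.

```python
# (z range, expected pairs, 1-sigma dispersion, observed pairs, quoted SL) from Table 1
TABLE_1 = [
    ("2.16-2.40", 7.3, 3.7, 9, 0.5),
    ("2.40-2.60", 3.1, 1.7, 3, -0.1),
    ("2.60-2.80", 18.2, 4.2, 26, 1.9),
    ("2.80-3.00", 19.6, 4.0, 32, 3.1),
    ("3.00-3.26", 16.7, 4.1, 21, 1.0),
    ("2.60-3.26", 55.6, 7.4, 79, 3.2),
]

def significance(n_obs, n_exp, sigma_exp):
    """Significance level of an observed pair count against the control sample."""
    return (n_obs - n_exp) / sigma_exp

def fractional_excess(n_obs, n_exp):
    """Excess of observed over expected pairs, as a fraction of the expectation."""
    return n_obs / n_exp - 1.0

for z_range, n_exp, sigma_exp, n_obs, quoted in TABLE_1:
    sl = significance(n_obs, n_exp, sigma_exp)
    print(f"{z_range}: SL = {sl:+.2f} (quoted {quoted:+.1f})")

excess_percent = 100.0 * fractional_excess(79, 55.6)
print(f"excess at 2.60 < z < 3.26: {excess_percent:.0f}%")
```

The fractional excess is also the value of ξ in that bin if ξ is estimated as the ratio of observed to control pair counts minus one.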



The low redshift range 2.16 < z < 2.60 does not present any significant signal at any
velocity splitting, but the small number of expected line pairs per velocity bin (3.1 to 7.3,
see first two rows in Table 1) hinders the detection of all but the very strongest correlation.

Fig. 7. The two point correlation function ξ(Δv) for the observed lines, with rest equivalent
width threshold W₀ > 0.1 Å and covering 2.15 < z < 3.26, the entire redshift range over
which there are data for at least two lines of sight. Only pairs between different lines of sight
are counted. The dark, medium and light shadings signify the 68%, 95% and 99% confidence
limits determined by 1000 Monte Carlo simulations as described in the text.

Fig. 8. The two point correlation function ξ(Δv) for the observed lines, with rest equivalent
width threshold W₀ > 0.1 Å and confidence limits as in Figure 7, but for 2.60 < z < 3.26.

Table 1. Redshift dependence of the feature at 50 < Δv/km s⁻¹ < 100 for W₀ > 0.1 Å.

z range(a)    sample size(b)    N_exp ± σ_exp(c)    N_obs(d)    SL(e)
2.16-2.40          67              7.3 ± 3.7            9         0.5
2.40-2.60          43              3.1 ± 1.7            3        -0.1
2.60-2.80          86             18.2 ± 4.2           26         1.9
2.80-3.00          89             19.6 ± 4.0           32         3.1
3.00-3.26          98             16.7 ± 4.1           21         1.0
2.60-3.26         273             55.6 ± 7.4           79         3.2

(a) the redshift range used to define the subsample
(b) number of Lyα lines in the given redshift range with W₀ > 0.1 Å
(c) the mean number and first moment around the mean of pairs with
    50 < Δv/km s⁻¹ < 100 expected from 1000 Monte Carlo simulations
(d) the observed number of pairs
(e) the significance level of the signal: SL = (N_obs - N_exp)/σ_exp



3.3.2. Rest equivalent width dependence 

In a similar way, we calculated ξ(Δv) for subsamples of 2.60 < z < 3.26 lines based on
their rest equivalent width. Table 2 shows that the significance of the correlation is larger
for low values of the minimum rest equivalent width threshold: it is quite strong (3.6σ(ξ))
for 0.1 < W₀/Å < 0.7, and reaches 4.0σ(ξ) for 0.1 < W₀/Å < 0.9. However, the significance
of the signal rapidly decreases for increasing minimum values of W₀. If the real value of
ξ = 0.5, the small number of lines prevents the detection of signal at the 3σ significance
level for W₀ > 0.4 Å. Therefore, we can only conclude that the value of the correlation
function does not increase strongly with the minimum equivalent width of the lines.

We also investigated whether the pairs of W₀ > 0.1 Å absorbers within the
50 < Δv/km s⁻¹ < 100 bin tend to present similar equivalent widths. We define
ΔW₀,ij = |W₀,i - W₀,j| for lines i, j of the sample. The distribution of ΔW₀,ij in that bin
is not significantly different from that of the data as a whole. Therefore, the line pairs
which produce the signal do not possess a significantly high proportion of pairs of lines with
similar rest equivalent widths.



3.3.3. Angular separation dependence 

We investigated the dependence of the signal strength on the angular separation
Δθ_ij between the background quasars. The seven QSOs which contribute Lyα lines to
the sample at 2.60 < z < 3.26 have angular separations of 6.1 < Δθ/arcmin < 41.2.
However, QSO 0041-2707 contributes only 9 of the 273 lines in the sample and provides a
separation of Δθ ~ 41 arcmin only with QSO 0042-2627 and QSO 0043-2633 for 3% of
the line pairs; the other 97% of the line pairs come from QSOs with angular separation
6.1 < Δθ/arcmin < 31.2. We split the sample of line pairs (i.e., lines with W_0 > 0.1 Å
and 2.60 < z < 3.26) into two parts of similar size: the lines detected in the spectra of
quasars separated by Δθ < 24 and Δθ > 24 arcmin formed the small and large Δθ samples,
respectively. We find an excess of 8 line pairs (33 compared to 24.7 ± 5.0 expected) with
50 < Δv/km s^-1 < 100 in the small Δθ sample, a 1.7σ(ξ) overdensity. The large Δθ sample
produces a 2.8σ(ξ) excess of 15 line pairs (46 compared to 30.8 ± 5.4) in the same velocity
bin. Therefore neither half of the sample produces a significant correlation on its own.

However, two lines of sight (towards 0042-2627 and 0042-2656, Δθ = 29.1 arcmin)
Table 2. Rest equivalent width dependence of the feature at 50 < Δv/km s^-1 < 100 for
2.60 < z < 3.26.

W_0 range (Å)^a     sample size^b    N_exp ± σ_exp^c    N_obs^d    SL^e

W_0 > 0.1               273           55.6 ± 7.4          79        3.2
W_0 > 0.2               251           47.3 ± 6.8          70        3.4
W_0 > 0.3               217           35.8 ± 5.9          47        1.9
W_0 > 0.4               185           26.4 ± 5.3          32        1.1
W_0 > 0.5               155           18.8 ± 4.5          22        0.7
W_0 > 0.6               127           12.7 ± 4.0          11       -0.7
0.1 < W_0 < 0.4          88            5.2 ± 2.6           8        1.1
0.1 < W_0 < 0.5         118            9.6 ± 3.4          20        3.1
0.1 < W_0 < 0.6         146           15.1 ± 4.0          24        2.2
0.1 < W_0 < 0.7         165           19.6 ± 4.3          35        3.6
0.1 < W_0 < 0.8         181           23.8 ± 4.5          40        3.5
0.1 < W_0 < 0.9         200           29.1 ± 5.2          50        4.0
0.1 < W_0 < 1.0         216           34.1 ± 5.6          54        3.5

^a the rest equivalent width range in Å used to define the subsample
^b number of Lyα lines in the given W_0 range
^c the mean number and first moment around the mean of pairs with 50 < Δv/km s^-1 < 100
   expected from 1000 Monte Carlo simulations
^d the observed number of pairs
^e the significance level of the signal: SL = (N_obs - N_exp)/σ_exp

provide 27% of the total number of pairs in the 50 < Δv/km s^-1 < 100 bin, with 21 pairs
observed while only 14.3 ± 1.6 are expected. They are also the QSOs which contribute most
to the number of absorption lines in the z > 2.60 range. The large number of absorption
systems toward each QSO, and the overabundance of pairs between them, led us to
suspect that both spectra might be contaminated by metal line systems from
absorbers which, coincidentally or not, lie at the same redshift except for a velocity
difference of 50 < Δv/km s^-1 < 100. We have investigated this possibility but were forced
to reject it for two reasons. (A) If the signal actually came from metal lines, they would
appear clustered in redshift, either because some of them would be doublets (Mg II, Al III,
C IV) or because they would have recognizable line separations (e.g. lines from Fe I, Fe II,
etc.). We do not see this effect: on the contrary, the lines contributing to the signal are
well spread over the whole common redshift range. (B) We searched for additional heavy
element systems and found candidates for C IV or Mg II doublets; however, they do not
constitute a large number of lines. Higher resolution spectra would be needed to eliminate
possibility (B) definitively.



3.3.4. Nearest neighbor distribution

In order to confirm the excess of line pairs with 50 < Δv/km s^-1 < 100, we also
calculated the nearest neighbor distribution NN(Δv) and its first moment about the
mean σ(NN), which provides a more sensitive test for correlations at small separations
than the two point correlation function: it reveals a 2.9σ(NN) overdensity of line pairs
at 50 < Δv/km s^-1 < 100 (Figure 9). The Kolmogorov-Smirnov (KS) test, which is
independent of the velocity binning, indicates the likelihood of the observed data being
consistent with the random control sample to be P = 0.0001. We also computed the
variance V of the nearest neighbor function NN(Δv) for each simulation j over the velocity
bins Δv_i,

V_j = \sum_i \left[ NN_j(\Delta v_i) - \langle NN(\Delta v_i) \rangle \right]^2 \qquad (7)

where the means are taken over all simulations j (Figure 10). The observed variance is
exceeded by that of the simulations in 36 cases out of 1000. The KS test and the distribution
of variances indicate that the probability of the observed nearest neighbor distribution
arising from a random distribution of absorbers is small, though the exact probability is
difficult to determine because the estimates from the two methods differ by a factor of
360. A better understanding of the signal can be obtained with a more detailed model,
which we describe in the next section.
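
The nearest neighbor statistics described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used for the analysis: we assume that the neighbor of a line is searched among the lines of the other sight lines, that the velocity separation is Δv = c |z_1 - z_2| / (1 + z_mean), and that the variance is the sum over velocity bins of the squared deviation from the simulation mean. All function names are hypothetical.

```python
C_KM_S = 299792.458  # speed of light in km/s


def delta_v(z1, z2):
    """Velocity separation in km/s between two redshifts (assumed form)."""
    return C_KM_S * abs(z1 - z2) / (1.0 + 0.5 * (z1 + z2))


def nearest_neighbor_separations(sightlines):
    """For each line, the smallest delta_v to any line on a different sight line."""
    separations = []
    for q, lines in enumerate(sightlines):
        others = [z for p, other in enumerate(sightlines) if p != q for z in other]
        if not others:
            continue
        for z in lines:
            separations.append(min(delta_v(z, z_other) for z_other in others))
    return separations


def histogram(values, bin_width, n_bins):
    """Counts per velocity bin; values beyond the last bin are dropped."""
    counts = [0] * n_bins
    for value in values:
        k = int(value // bin_width)
        if k < n_bins:
            counts[k] += 1
    return counts


def variance_statistic(nn_hist, mean_hist):
    """Sum over bins of the squared deviation from the mean over simulations."""
    return sum((n - m) ** 2 for n, m in zip(nn_hist, mean_hist))


def fraction_exceeding(observed, simulated):
    """Fraction of simulated statistics that exceed the observed one."""
    return sum(1 for value in simulated if value > observed) / len(simulated)


# Toy example: two sight lines, with one close pair near z = 2.70.
toy = [[2.700, 2.900], [2.701, 3.100]]
print(histogram(nearest_neighbor_separations(toy), bin_width=50.0, n_bins=4))
```

In an actual test, `variance_statistic` would be evaluated for the observed histogram and for each Monte Carlo realization, and `fraction_exceeding` would then give the kind of fraction quoted in the text (36 out of 1000).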
Fig. 9. The nearest neighbor distribution for the observed lines (W_0 > 0.1 Å) at
2.60 < z < 3.26. The dark, medium and light shadings signify the 68%, 95% and 99%
confidence limits determined by 1000 Monte Carlo simulations as described in the text.
There is a 2.9σ overdensity of line pairs at 50 < Δv < 100 km s^-1.
Fig. 10. The distribution of variances in the nearest neighbor distribution NN(Δv) for
1000 Monte Carlo simulations. The dashed line indicates the variance for the observed lines,
which is exceeded by 3.6% of the Monte Carlo simulations.
3.3.5. A model for the signal

The origin of the correlation is mysterious if the result is taken at face value, that is,
if the excess of pairs really occurs at 50 < Δv/km s^-1 < 100, and does not actually peak
at 0 < Δv/km s^-1 < 50. Indeed, all physically plausible models which we have considered,
i.e. sheets, filaments or any other connected structures spanning 30 arcmin on the sky, that
would give a signal at 50 < Δv/km s^-1 < 100 at large angular separation, would also
give a signal at 0 < Δv/km s^-1 < 50 at small Δθ. Since the number of line pairs
per bin is comparable at ~ 10, 20 and 30 arcmin, any such model would lead to a signal of
similar strength in these two velocity bins.

As the accuracy of the line wavelength is thought to be of the order of 13 km s^-1, if
the 'real' correlation is actually at 0 < Δv/km s^-1 < 50, the effect of underestimating
the uncertainty of the line center measurement would a priori only broaden the peak of the
correlation, not shift its centroid. However, the blending of several Lyα lines due to the low
spectral resolution can account for the observed Δv of the correlation peak if the signal to
noise ratio is not very good.

In order to test this hypothesis, we modified the generation of the simulated spectra
described in §3.1. Specifically, we changed the way the input line list is produced before
the creation of the Voigt profiles, in such a way as to introduce a correlation between the
different lines of sight, while conserving the mean density of (input) lines per unit redshift.

First, we produced a new input line list following the same parameters as the ones
described in §3.1, but extending over the redshift range whose limits are the minimal and
maximal redshifts covered by our spectra. Let us call this input line list the full-range line
list, F = ∪_{i=1,n} F_i, where F_i = (z_i, N_HI,i, b_i). Similarly, let us call the input line list for
the spectra S_q = ∪_{i=1,n} S_q,i, with S_q,i = (z_q,i, N_HI,q,i, b_q,i), where q = 1, ..., 10 identifies the
quasar.

Two additional input parameters are needed: c, which describes the percentage of
input lines to be common to each new input line list, and σ_c, which gives the velocity
dispersion of the common line lists along the different lines-of-sight.

The new input line list N_q for the quasar q is created in the following way. It is the
union of two sets of lines. The first set of lines is common to each quasar. It originates from
the full-range line list F: each line F_i has a probability c to be included in each new line
list N_q; the values of N_HI,i and b_i are identical in each N_q, but the redshift z_i is modified
to include a peculiar velocity, whose value follows a Gaussian distribution with the velocity
dispersion σ_c. The second set of lines is unique to each quasar and comes from the line lists
S_q: each line S_q,i has a probability (1 - c) of being included (as it is) in the line list N_q.
The rest of the procedure is then identical to the one described in §3.1.
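
The construction of the correlated input line lists can be sketched as below. This is an illustrative reading of the recipe, not the original code: we assume that the common subset is drawn once from the full-range list and shared by all quasars, that the peculiar velocity shifts the redshift by (1 + z) v/c, and that trimming each list to the wavelength coverage of its spectrum happens in a later step. The function name `correlated_line_lists` and its arguments are hypothetical.

```python
import random

C_KM_S = 299792.458  # speed of light in km/s


def correlated_line_lists(full_range, per_quasar, c, sigma_c, rng):
    """Build one new input line list per quasar.

    full_range : list of (z, N_HI, b) tuples, the full-range line list
    per_quasar : list (one entry per quasar) of lists of (z, N_HI, b) tuples
    c          : probability that a full-range line is common to all quasars
    sigma_c    : velocity dispersion (km/s) of the common lines between sight lines
    rng        : a random.Random instance
    """
    # Assumption: the common lines are selected once and shared by every quasar.
    common = [line for line in full_range if rng.random() < c]

    new_lists = []
    for own_lines in per_quasar:
        lines = []
        for z, n_hi, b in common:
            # Same N_HI and b on every sight line; only the redshift is perturbed.
            v_pec = rng.gauss(0.0, sigma_c)
            lines.append((z + (1.0 + z) * v_pec / C_KM_S, n_hi, b))
        # Lines unique to this quasar are kept unchanged with probability (1 - c).
        lines.extend(line for line in own_lines if rng.random() < 1.0 - c)
        lines.sort()
        new_lists.append(lines)
    return new_lists


rng = random.Random(1)
full = [(2.7, 1e14, 30.0), (2.9, 3e13, 25.0), (3.1, 5e13, 35.0)]
own = [[(2.65, 2e13, 28.0), (3.05, 1e14, 32.0)], [(2.75, 4e13, 22.0)]]
for q, line_list in enumerate(correlated_line_lists(full, own, 0.5, 100.0, rng)):
    print(q, len(line_list))
```

If the full-range list and the individual lists have the same line density per unit redshift, the expected density of the new lists, c times the former plus (1 - c) times the latter, is unchanged, which is the conservation property required in the text.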



It is important to note that the percentage of common lines in the input line list is not
necessarily reflected in the 'observed' line list, as mentioned in §3.2. The blending due both
to the intrinsic width of the lines and to the 2 Å spectral resolution (1) leads relatively
weak lines to disappear into the wings of stronger lines, and (2) makes several lines with
small wavelength separation appear as one. These effects are revealed in the results of the
following test.

We computed the value of the two point correlation function for different values of c
and σ_c by creating 1000 simulations for each pair (c, σ_c) considered. Figure 11 presents some
of the results, expressed in confidence level in the second bin (50 < Δv/(km s^-1) < 100)
vs. the confidence level in the first bin (0 < Δv/(km s^-1) < 50). In each case, the asterisk
symbol indicates the values obtained with the observed data.

The top left panel shows the results when no line is (arbitrarily) common between the
different lines-of-sight: as expected, there are very few cases where a false positive signal is
detected. In this set of 1000 simulations, there is only 1 case when a value is larger than
3σ(ξ) in either of the two bins. We also note two other false positive cases of negative
correlation, also at greater than the 3σ(ξ) level.

If 10% of the lines are common (c = 0.10) to each line-of-sight with a velocity dispersion
of σ_c = 50 km s^-1 (top middle panel), then the correlation is detected at the 3σ(ξ) level
in at least 1 of the first two bins in only about 4% of the cases, with the detection in the
2nd bin alone accounting for 25% of these cases. Even if 20% of the lines are common with
σ_c = 50 km s^-1, the correlation function does not consistently show any significant signal:
moreover, the quadratic sum of the significance levels in the first two bins does not reveal
any signal at more than 3σ(ξ) in 64% of the cases.



Table 3. Percentage of cases in which the significance level of the signal is larger than
3.2σ in at least one of the first 3 bins.

c                  0%     10%     30%     40%     50%     60%     70%     70%     80%
σ_c (km s^-1)              50     100     150     200     200     200     250     250

cases (%)         0.2     2.5     7.4     7.0     8.7    17.2    33.8    22.2    40.2
Fig. 11. The overdensity of line pairs in the second (50 < Δv/(km s^-1) < 100) bin vs.
the first (0 < Δv/(km s^-1) < 50) for the two point correlation function ξ(Δv), expressed in
units of the first moment about the mean for an average of 1000 Monte Carlo simulations,
σ(ξ). A series of models were calculated, with varying percentages of artificially correlated
lines c and velocity dispersion σ_c. The asterisk symbol indicates the value derived from the
observations.
As can be seen in the different panels, the effect of increasing the number of common
lines c is to move the set of points approximately along the diagonal of equal confidence
limits towards larger values, while a larger velocity dispersion increases their spread
perpendicular to this direction and reduces the number of bins with significant signals.
These simulations, whose results are summarized in Table 3, show that it is unlikely that
the signal that we detect is a statistical fluke; on the contrary, it probably indicates that a
significant number of lines are common between the different lines-of-sight with a velocity
dispersion probably larger than 100 km s^-1. Unfortunately, the limited spectral resolution
of our data does not allow us to quantify this result better.
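
The percentages of Table 3 are detection fractions over a set of simulations. A minimal sketch of such a tally is given below; it assumes that each simulation has already been reduced to a list of significance levels per velocity bin and that only positive excesses count as detections. The name `percent_detected` and the input layout are ours.

```python
def percent_detected(sim_significances, threshold=3.2, n_bins=3):
    """Percentage of simulations with SL > threshold in at least one of the first n_bins."""
    hits = sum(1 for sl in sim_significances if max(sl[:n_bins]) > threshold)
    return 100.0 * hits / len(sim_significances)


# Toy input: four simulations, each with the significance level in four velocity bins.
toy = [
    [0.1, 3.5, 0.0, 5.0],   # detected in the second bin
    [1.0, 2.0, 3.0, 9.0],   # the only excess is in the fourth bin, which is ignored
    [3.3, 0.0, 0.0, 0.0],   # detected in the first bin
    [-4.0, 0.0, 0.0, 0.0],  # negative correlation, not counted as a detection
]
print(percent_detected(toy))
```

Applied to the 1000 realizations of each (c, σ_c) model, a tally of this kind would yield one entry of the table.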



4. Discussion 

We have searched for correlations among a sample of 383 W_0 > 0.1 Å (5σ detection
threshold) Lyα absorbers ranging over 2.15 < z < 3.26 in front of 10 QSOs separated
by 6.1 < Δθ/arcmin < 69, an angular separation an order of magnitude greater than for
any other study for more than a simple pair of QSOs. Our statistical tests have consisted
of the three dimensional two point correlation function, the redshift distribution dN/dz,
and the two point correlation function in redshift space. We have found no evidence for
clustering in the three dimensional two point correlation function, and no anomalies in
the absorber redshift distribution dN/dz. In fact, we find that the three dimensional two
point correlation function is less sensitive to clusters of Lyα absorbers than dN/dz.

We have calculated the two point correlation function in velocity space and find a signal
of ξ(Δv) = 0.35 with significance 3.2σ(ξ) at velocity separation 50 < Δv/(km s^-1) < 100
for a subsample at 2.60 < z < 3.26 and W_0 > 0.1 Å. Its significance rises to 4.0σ(ξ) if
the rest equivalent width is restricted to 0.1 < W_0/Å < 0.9, but tends to weaken with
increasing minimum values of W_0. However, given the limited sample size, we do not draw
any stronger conclusion than that the significance of the signal does not strongly increase
with the minimum value of W_0.

Additional simulations show that blending due to the low spectral resolution of our 
spectra may often destroy any signal even if most of the lines are common between the lines 
of sight. However, they also show that if any signal is detected in any of the first few bins, 
it is unlikely to be due to chance. Instead, such a signal very often reveals the presence 
of an underlying correlation. If the correlation that we find is only the strongest part of 
an underlying distribution, which may extend over a larger range in velocity space, then 
the analysis of a larger and higher resolution data sample should confirm the reality of the 
feature. 
Physically, a correlation at such a small velocity dispersion could arise from the
apparent collapse of structures along the line of sight (the "bull's-eye effect", Praton,
Melott, & McKee 1997; Melott et al. 19^), reducing their apparent extent in velocity
space. Furthermore, if such structures contain Lyα absorbers on the scale of 10-30 arcmin,
or 8.7-26 (13-40) h^-1 comoving Mpc for Ω = 1.0 (0.2), then density gradients within the
structures could explain the difference between the clustering of strong absorbers that
Crotts & Fang found on small angular scales, and of weaker absorbers which we find on
larger scales. Overdensities and underdensities on the scale of a few tens of comoving Mpc
have been identified along individual lines of sight (Cristiani et al. 1997), so similar features
in the plane of the sky are plausible.

Oort (1981, 1983, 1984) suggested that correlated Lyα absorption on 0.5-1° scales
could be the signature of high redshift superclusters arising from "pancake" formations.
Simulations of the growth of cosmological structures (Petitjean, Mücket, & Kates 1995;
Hernquist et al. 1996; Mücket et al. 1996; Rauch, Haehnelt, & Steinmetz 1997; Zhang
et al. 19^, 1998) indicate that structures (e.g. filaments/sheets) of dark matter and gas
extend up to several Mpc, forming a "cosmic web" (Bond et al. 1996). Such structures
produce Lyα absorption up to 7 comoving Mpc from luminous galaxies or groups of
galaxies (Petitjean, Mücket, & Kates 1995). The detailed analysis of simulations can yield
quantitative predictions for the Lyα forest correlation function in the larger context of
galaxy formation (on scales of ~ 1 h^-1 Mpc, Cen & Simcoe 1997), and permit the recovery
of the power spectrum of density perturbations (on scales of up to 11 h^-1 Mpc, Croft et al.
1998), though at present, the small size of the simulation boxes does not permit similar
predictions on the scale probed by our data. Our observations indicate that structures
coherent over more than 7 comoving Mpc may well exist in the Lyα forest at z ~ 3. As
simulations become more advanced and box sizes increase, it will be possible to compare
model structures to those of the scale we find in our data.

The correlations of the Lyα lines in velocity space imply large scale structure extending
over 30 arcmin, or about 26 (40) h^-1 comoving Mpc for Ω = 1.0 (0.2). The comparison
between Lyα absorbers on such wide angular scales provides a unique tool to probe the
evolution of large scale structure at high redshift. With 8-10 m class telescopes, it will be
possible to survey fainter QSOs, which would provide a much higher density per unit area
on the sky and thus enable a much more detailed probe of the correlation behavior of QSO
absorption systems. It will also be possible to detect routinely bright galaxies in the vicinity
of such correlated absorbers, to reveal more details about the relationship between the two
sorts of objects and the distribution of matter in general.
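
The conversion between angular separation and transverse comoving distance used above can be reproduced, approximately, with the Mattig relation for a matter-only (Λ = 0) universe. The sketch below is ours and rests on stated assumptions: H_0 = 100 h km s^-1 Mpc^-1, a representative redshift z = 3 (the text quotes z ~ 3), and the small-angle approximation. The function names are hypothetical.

```python
import math

C_OVER_H0 = 2997.92458  # c / (100 km/s/Mpc), i.e. the Hubble distance in h^-1 Mpc


def transverse_comoving_distance(z, omega):
    """Transverse comoving distance in h^-1 Mpc for Lambda = 0 (Mattig relation)."""
    d_lum = (2.0 * C_OVER_H0 / omega**2) * (
        omega * z + (omega - 2.0) * (math.sqrt(1.0 + omega * z) - 1.0)
    )
    return d_lum / (1.0 + z)  # luminosity distance -> transverse comoving distance


def comoving_separation(theta_arcmin, z, omega):
    """Comoving separation in h^-1 Mpc subtended by theta_arcmin at redshift z."""
    theta_rad = math.radians(theta_arcmin / 60.0)
    return theta_rad * transverse_comoving_distance(z, omega)


for omega in (1.0, 0.2):
    small = comoving_separation(10.0, 3.0, omega)
    large = comoving_separation(30.0, 3.0, omega)
    print(f"Omega = {omega}: 10 arcmin -> {small:.1f}, 30 arcmin -> {large:.1f} h^-1 Mpc")
```

With z = 3 this gives about 8.7 and 26 h^-1 comoving Mpc for Ω = 1.0, and about 13 and 40 h^-1 comoving Mpc for Ω = 0.2, in line with the values quoted in the text; a lower assumed redshift gives somewhat smaller separations.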

We appreciate useful conversations with A. Crotts, V. Icke, V. Khersonsky, J. Liske, P. 